Are Emergent Abilities of Large Language Models a Mirage?

Neural Information Processing Systems

Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities intriguing is two-fold: their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales. Here, we present an alternative explanation for emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous, predictable changes in model performance. We present our alternative explanation in a simple mathematical model, then test it in three complementary ways: we (1) make, test and confirm three predictions on the effect of metric choice using the InstructGPT/GPT-3 family on tasks with claimed emergent abilities; (2) make, test and confirm two predictions about metric choices in a meta-analysis of emergent abilities on BIG-Bench; and (3) show how to choose metrics to produce never-before-seen seemingly emergent abilities in multiple vision tasks across diverse deep networks. Via all three analyses, we provide evidence that alleged emergent abilities evaporate with different metrics or with better statistics, and may not be a fundamental property of scaling AI models.
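The mechanism is easy to reproduce in a few lines. The sketch below is our illustration rather than the paper's code, and the power-law accuracy curve and answer length L are assumptions: the same smoothly improving per-token accuracy looks gradual under a linear metric and "emergent" under exact match.

```python
# Toy illustration (not the paper's code): a smoothly improving model can
# look "emergent" under a nonlinear metric. Assume per-token accuracy p(N)
# follows a smooth power law in model scale N; exact match on an L-token
# answer then scores p(N)**L, which hugs zero before rising abruptly.
import numpy as np

scales = np.logspace(6, 11, 6)               # hypothetical parameter counts
p = 1.0 - 0.4 * (scales / 1e6) ** -0.25      # smooth per-token accuracy
L = 10                                       # answer length in tokens

for n, tok, em in zip(scales, p, p ** L):
    print(f"N={n:.0e}  per-token={tok:.3f}  exact-match={em:.3f}")
```

Per-token accuracy climbs steadily from 0.60 to 0.98 across five decades of scale, while exact match sits near zero for small models and then takes off: the "sharpness" comes from the metric, not the model.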




SCRN escapes saddle points and converges to local minimizers faster under the Strong Growth Condition (SGC)

Neural Information Processing Systems

We thank all the reviewers for their valuable comments. Prior works (e.g., [VBS18]) considered only convergence to critical points. We provide our results in both the zeroth- and higher-order settings, and we handle the SGC assumption for unbounded functions, which was not done before in the literature. The analysis of SCRN is also significantly more involved under SGC (especially in the zeroth-order setup); see also Remarks 6 and 7. Please see Lines 2-10 above. However, the method in [AL18] is a theoretical computer science-style reduction approach.
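For readers unfamiliar with SCRN, here is a minimal sketch of the cubic-regularized Newton step it builds on, run on a toy saddle; the objective, the regularization constant M, and the inner solver are our illustrative choices, not the authors' setup.

```python
# Minimal sketch (ours, not the authors' algorithm or assumptions) of the
# cubic-regularized Newton update at the core of SCRN: minimize the model
# m(s) = g.s + 0.5*s'Hs + (M/6)*||s||^3, then move to w + s. Minimizing
# the cubic model exploits negative curvature, which lets it leave saddles.
import numpy as np

def f(w):      # toy nonconvex objective with a saddle point at the origin
    return 0.5 * w[0]**2 - 0.25 * w[1]**2 + 0.1 * w[1]**4

def grad(w):
    return np.array([w[0], -0.5 * w[1] + 0.4 * w[1]**3])

def hess(w):
    return np.diag([1.0, -0.5 + 1.2 * w[1]**2])

def cubic_step(g, H, M=1.0, iters=500, lr=0.2):
    # Approximately minimize the cubic model by gradient descent on s;
    # grad m(s) = g + H s + (M/2)*||s||*s.
    s = -0.01 * g / (np.linalg.norm(g) + 1e-12)
    for _ in range(iters):
        s -= lr * (g + H @ s + 0.5 * M * np.linalg.norm(s) * s)
    return s

w = np.array([0.5, 1e-3])      # near the saddle, where plain GD crawls
for _ in range(20):
    w = w + cubic_step(grad(w), hess(w))
print(w, f(w))                 # lands near the local minimizer (0, ~1.118)
```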




Ensured: Explanations for Decreasing the Epistemic Uncertainty in Predictions

Löfström, Helena, Löfström, Tuwe, Szabadvary, Johan Hallberg

arXiv.org Artificial Intelligence

This paper addresses a significant gap in explainable AI: the necessity of interpreting epistemic uncertainty in model explanations. Although current methods mainly focus on explaining predictions, with some including uncertainty, they fail to provide guidance on how to reduce the inherent uncertainty in these predictions. To overcome this challenge, we introduce new types of explanations that specifically target epistemic uncertainty. These include ensured explanations, which highlight feature modifications that can reduce uncertainty, and a categorisation of uncertain explanations into counter-potential, semi-potential, and super-potential explanations, which explore alternative scenarios. Our work emphasises that epistemic uncertainty adds a crucial dimension to explanation quality, demanding evaluation based not only on prediction probability but also on uncertainty reduction. We introduce a new metric, ensured ranking, designed to help users identify the most reliable explanations by balancing trade-offs between uncertainty, probability, and competing alternative explanations. Furthermore, we extend the Calibrated Explanations method, incorporating tools that visualise how changes in feature values impact epistemic uncertainty. This enhancement provides deeper insights into model behaviour, promoting increased interpretability and appropriate trust in scenarios involving uncertain predictions.
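As a reading aid, here is a purely hypothetical sketch of the kind of trade-off such a ranking could encode; the scoring rule, weights, and field names below are our inventions, not the paper's ensured ranking metric.

```python
# Hypothetical sketch only: the paper defines its own "ensured ranking"
# metric; this toy score merely illustrates the trade-off the abstract
# describes, preferring candidate explanations whose feature change yields
# high predicted probability with a narrow epistemic uncertainty interval.
def toy_ensured_rank(candidates, w_prob=0.5, w_unc=0.5):
    # candidates: dicts with 'prob' (point prediction) and 'low'/'high'
    # (epistemic uncertainty bounds) after the candidate feature change.
    def score(c):
        width = c["high"] - c["low"]   # narrower interval = less uncertainty
        return w_prob * c["prob"] - w_unc * width
    return sorted(candidates, key=score, reverse=True)

ranked = toy_ensured_rank([
    {"name": "raise income", "prob": 0.81, "low": 0.70, "high": 0.92},
    {"name": "lower debt",   "prob": 0.78, "low": 0.74, "high": 0.82},
])
print([c["name"] for c in ranked])  # 'lower debt' first: similar probability,
                                    # far less epistemic uncertainty
```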


How We Refute Claims: Automatic Fact-Checking through Flaw Identification and Explanation

Kao, Wei-Yu, Yen, An-Zi

arXiv.org Artificial Intelligence

Automated fact-checking is a crucial task in the governance of internet content. Although various studies utilize advanced models to tackle this issue, a significant gap persists in addressing complex real-world rumors and deceptive claims. To address this challenge, this paper explores the novel task of flaw-oriented fact-checking, including aspect generation and flaw identification. We also introduce RefuteClaim, a new framework designed specifically for this task. Given the absence of an existing dataset, we present FlawCheck, a dataset created by extracting and transforming insights from expert reviews into relevant aspects and identified flaws. The experimental results underscore the efficacy of RefuteClaim, particularly in classifying and elucidating false claims.
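To make the task decomposition concrete, here is a hypothetical two-stage skeleton; the prompts, flaw labels, and the call_llm stub are illustrative stand-ins of ours, not RefuteClaim's actual design.

```python
# Hypothetical skeleton only: the paper defines RefuteClaim's real prompts
# and flaw taxonomy. This sketches the two-stage task the abstract names:
# (1) generate a claim's checkable aspects, (2) identify the flaw in each.
FLAW_TYPES = ["factual error", "misleading statistics", "false causality",
              "out-of-context quote", "no identifiable flaw"]  # illustrative

def call_llm(prompt: str) -> str:
    # Stub so the sketch runs end to end; swap in any real LLM client.
    return ("cited statistics\nsource attribution" if "aspects" in prompt
            else "misleading statistics")

def refute(claim: str) -> dict:
    aspects = call_llm(f"List the checkable aspects of this claim:\n{claim}").splitlines()
    return {a: call_llm(f"Claim: {claim}\nAspect: {a}\n"
                        f"Which of these flaws applies? {FLAW_TYPES}")
            for a in aspects}

print(refute("90% of sharks prefer jazz, so aquariums should play it."))
```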


#NeurIPS2023 outstanding papers

AIHub

The thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023) is underway in New Orleans. At the official opening session of the conference on Monday evening, the outstanding papers were announced. The awards comprised two outstanding main track paper awards, two outstanding main track runner-ups, two outstanding datasets and benchmark track papers, and the annual test of time award. Abstract: We propose a scheme for auditing differentially private machine learning systems with a single training run. This exploits the parallelism of being able to add or remove multiple training examples independently.
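That one-sentence idea can be sketched in toy form; the coin-flip canaries, Gaussian "trainer", and threshold below are stand-ins of ours, not the awarded paper's procedure or bounds.

```python
# Toy sketch of single-run privacy auditing (ours; not the paper's exact
# procedure or bounds). Each canary is included independently with prob. 1/2,
# one "training run" produces a per-canary score, and we check how well the
# scores predict inclusion; for an (eps, delta)-DP trainer that guessing
# accuracy is provably capped, so high accuracy flags a privacy violation.
import numpy as np

rng = np.random.default_rng(0)
m = 1000                                   # many canaries, audited in parallel
included = rng.random(m) < 0.5             # independent coin per canary

def train_and_score(included, noise=1.0):
    # Stand-in for one DP training run: each canary's score leaks its
    # inclusion bit plus Gaussian noise (more noise ~ more privacy).
    return included.astype(float) + rng.normal(0.0, noise, size=m)

scores = train_and_score(included)
guesses = scores > 0.5                     # guess "included" on a high score
acc = (guesses == included).mean()
print(f"guessing accuracy: {acc:.3f}")     # ~0.69 here; more noise -> ~0.5
```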


Are Emergent Abilities of Large Language Models a Mirage?

Schaeffer, Rylan, Miranda, Brando, Koyejo, Sanmi

arXiv.org Artificial Intelligence

Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities intriguing is two-fold: their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales. Here, we present an alternative explanation for emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous, predictable changes in model performance. We present our alternative explanation in a simple mathematical model, then test it in three complementary ways: we (1) make, test and confirm three predictions on the effect of metric choice using the InstructGPT/GPT-3 family on tasks with claimed emergent abilities; (2) make, test and confirm two predictions about metric choices in a meta-analysis of emergent abilities on BIG-Bench; and (3) show how to choose metrics to produce never-before-seen seemingly emergent abilities in multiple vision tasks across diverse deep networks. Via all three analyses, we provide evidence that alleged emergent abilities evaporate with different metrics or with better statistics, and may not be a fundamental property of scaling AI models.